Learning to Act Using Real-Time Dynamic Programming

Authors

Andrew G. Barto

Steven J. Bradtke

Satinder P. Singh

Abstract

Learning methods based on dynamic programming (DP) are receiving increasing attention in artificial intelligence. Researchers have argued that DP provides the appropriate basis for compiling planning results into reactive strategies for real-time control, as well as for learning such strategies when the system being controlled is incompletely known. We introduce an algorithm based on DP, which we call Real-Time DP (RTDP), by which an embedded system can improve its performance with experience. RTDP generalizes Korf's Learning-Real-Time-A* algorithm to problems involving uncertainty. We invoke results from the theory of asynchronous DP to prove that RTDP achieves optimal behavior in several different classes of problems. We also use the theory of asynchronous DP to illuminate aspects of other DP-based reinforcement learning methods such as Watkins' Q-Learning algorithm. A secondary aim of this article is to provide a bridge between AI research on real-time planning and learning and relevant concepts and algorithms from control theory.
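To make the idea of backing up values only along the states an agent actually visits more concrete, the following is a minimal sketch of a trial-based, RTDP-style loop. It is an illustration under stated assumptions and not the authors' implementation: the four-state problem, the `TRANSITIONS` table, the helper `q_value`, and the parameters `trials`, `max_steps`, and `seed` are all hypothetical, and the zero initial cost estimates are chosen so that they do not overestimate the true costs in this toy problem.

```python
import random

# Hypothetical toy problem (not from the paper): TRANSITIONS[state][action]
# lists (probability, next_state, cost) outcomes. State 3 is the goal.
TRANSITIONS = {
    0: {"safe": [(1.0, 1, 1.0)],
        "risky": [(0.6, 2, 1.0), (0.4, 0, 1.0)]},
    1: {"go": [(1.0, 2, 1.0)]},
    2: {"go": [(1.0, 3, 1.0)]},
}
GOAL = 3


def q_value(values, state, action):
    """Expected immediate cost plus estimated cost-to-go of the successor."""
    return sum(p * (cost + values[nxt])
               for p, nxt, cost in TRANSITIONS[state][action])


def rtdp(start, trials=200, max_steps=1000, seed=0):
    """Run repeated trials from `start`, backing up only visited states."""
    rng = random.Random(seed)
    # Assumption: zero initial estimates (they do not overestimate costs here).
    values = {s: 0.0 for s in list(TRANSITIONS) + [GOAL]}
    for trial in range(trials):
        state = start
        for step in range(max_steps):
            if state == GOAL:
                break
            # Choose the action that looks cheapest under current estimates.
            action = min(TRANSITIONS[state],
                         key=lambda a: q_value(values, state, a))
            # Back up the value of the state the agent currently occupies.
            values[state] = q_value(values, state, action)
            # Execute the greedy action and sample its stochastic outcome.
            outcomes = TRANSITIONS[state][action]
            weights = [p for p, _, _ in outcomes]
            _, state, _ = rng.choices(outcomes, weights=weights)[0]
    return values
```

Calling `rtdp(0)` returns a dictionary of cost-to-go estimates. In this toy problem the "risky" action is the better choice in state 0, with an optimal expected cost of 8/3, and the estimate for state 0 approaches that value as trials accumulate.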

Similar resources

A DSS-Based Dynamic Programming for Finding Optimal Markets Using Neural Networks and Pricing

One of the substantial challenges in marketing efforts is determining optimal markets, specifically in market segmentation. The problem is more controversial in electronic commerce and electronic marketing. Consumer behaviour is influenced by different factors and thus varies in different time periods. These dynamic impacts lead to the uncertain behaviour of consumers and therefore harden the t...

Robust inter and intra-cell layouts design model dealing with stochastic dynamic problems

In this paper, a novel quadratic assignment-based mathematical model is developed for concurrent design of robust inter and intra-cell layouts in dynamic stochastic environments of manufacturing systems. In the proposed model, in addition to considering time value of money, the product demands are presumed to be dependent normally distributed random variables with known expectation, variance, a...

Optimization Model of Hirmand River Basin Water Resources in the Agricultural Sector Using Stochastic Dynamic Programming under Uncertainty Conditions

In this study, water management allocated to the agricultural sector was analyzed using stochastic dynamic programming under uncertainty conditions. The technical coefficients used in the study referred to the agricultural year 2013-2014. They were obtained through simple random sampling of 250 farmers in the region for the crops wheat, barley, melon, watermelon and ruby grapes under ...

A Defined Benefit Pension Fund ALM Model through Multistage Stochastic Programming

We consider an asset-liability management (ALM) problem for a defined benefit pension fund (PF). The PF manager is assumed to follow a maximal fund valuation problem facing an extended set of risk factors: due to the longevity of the PF members, the inflation affecting salaries in real terms and future incomes, interest rates and market factors affecting jointly the PF liability and asset p...

Dynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)

In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide a mobile robot through the dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...

Journal:

Artif. Intell.

Volume 72

Publication date: 1995
